Simplivariate Models: Ideas and First Examples

Authors

  • Jos A. Hageman
  • Margriet M. W. B. Hendriks
  • Johan A. Westerhuis
  • Mariët J. van der Werf
  • Ruud Berger
  • Age K. Smilde

Abstract

One of the new, expanding areas in functional genomics is metabolomics: measuring the metabolome of an organism. Data generated in metabolomics studies are very diverse in nature, depending on the design underlying the experiment. Traditionally, variation in measurements is conceptually broken down into systematic variation and noise, where the latter contains, e.g., technical variation. There is increasing evidence that this distinction does not hold (or is too simple) for metabolomics data. A more useful distinction is between informative and non-informative variation, where informative relates to the problem being studied. Most common methods for analyzing metabolomics (or any other high-dimensional x-omics) data ignore this distinction, which severely hampers the results of the analysis. This leads to poorly interpretable models and may even obscure the relevant biological information. We developed a framework from first data analysis principles by explicitly formulating the problem of analyzing metabolomics data in terms of informative and non-informative parts. This framework allows for flexible interactions with the biologists involved in formulating prior knowledge of underlying structures. The basic idea is that the informative parts of the complex metabolomics data are approximated by simple components with a biological meaning, e.g., in terms of metabolic pathways or their regulation. Hence, we termed the framework 'simplivariate models', which constitutes a new way of looking at metabolomics data. The framework is given in its full generality and exemplified with two methods, IDR analysis and plaid modeling, that fit into the framework. Using this strategy of 'divide and conquer', we show that meaningful simplivariate models can be obtained using a real-life microbial metabolomics data set. For instance, one of the simple components contained all the measured intermediates of the Krebs cycle of E. coli. Moreover, these simplivariate models were able to uncover regulatory mechanisms present in the phenylalanine biosynthesis route of E. coli.

Similar resources

Simplivariate Models: Uncovering the Underlying Biology in Functional Genomics Data

One of the first steps in analyzing high-dimensional functional genomics data is an exploratory analysis of such data. Cluster Analysis and Principal Component Analysis are then usually the methods of choice. Despite their versatility, they also have a severe drawback: they do not always generate simple and interpretable solutions. On the basis of the observation that functional genomics data oft...

Beyond first order logic: From number of structures to structure of numbers: Part II

We study the history and recent developments in nonelementary model theory, focusing on the framework of abstract elementary classes. We discuss the role of syntax and semantics and the motivation to generalize first order model theory to nonelementary frameworks, and illuminate the study with concrete examples of classes of models. This second part continues to study the question of categoricity trans...

How Developmental Psychology and Robotics Complement Each Other

This paper presents two complementary ideas relating the study of human development and the construction of intelligent artifacts. First, the use of developmental models will be a critical requirement in the construction of robotic systems that can acquire a large repertoire of motor, perceptual, and cognitive capabilities. Second, robotic systems can be used as a test-bed for evaluating models...

Developing EPQ models for non-instantaneous deteriorating items

In this paper, the classical economic production quantity (EPQ) model is developed for non-instantaneous deteriorating items by considering a relationship between the holding cost and the ordering cycle length. Two models are developed. First, the proposed model is considered when backorders are not permitted and this condition is waived for the second case. The cost functions associated with t...

Statistical Models for Longitudinal Data Analysis

Longitudinal data analysis has become a popular statistical method. In this paper we introduce four common statistical models for handling longitudinal data. First, we introduce what longitudinal data are and the purpose of doing such an analysis. Then, using SAS examples, we focus on acquiring more applicable skills and ideas of applying these statistical models to longitudinal data a...

Journal:
  • PLoS ONE

Volume 3

Publication date: 2008